There are four datasets in the Midline data folder:

i) HG_HW
	a) This is the household level dataset
	b) Observations in this dataset are uniquely identified by concatenating district code (hh_103_1), block code (hh_105_1), Gram Panchayat Code (hh_107_1), village code (hh_109_1), sahi code (hh_111_1) and household number (hh_101_1)
		b1) To generate the twelve-digit unique identifier, convert each code to a string, prefix a zero to any code with a value less than 10, and concatenate the codes in the order listed above
	c) There should be a total of 2947 household observations 
		c1) There are 2947 households in the dataset
		c2) 47 households were additionally covered in the midline and would not exist in baseline data.
		c3) 2 households from baseline are not found in the midline. The details of these have been specified below:
			091801020114 and 091801020214 are the same household (same identifiers), so the latter is not in the midline
			050904020305 has been merged with 050904020304 because the two brothers in these families had started living together by the time of the midline
		c4) The variable 'overlap' marks the observations in points c2) and c3) above: 1=HH only in ML, 2=HH only in BL, 3=HH in both BL and ML.
	d) The variable HH_HGID is the baseline ID corresponding to a HH
	e) Due to mismatches between the sample and midline HH identifiers, 70 observations could not be traced between the baseline and midline raw data
		e1) Appropriate corrections have been made to the codes using the method described in the file titled 'Methodology for HH_HW ID Correction'
	f) In 42 observations in the raw data, the respondent for the Woman's questionnaire was flagged as never married or male. Inspection of the data revealed two reasons for this:
		f1) The gender of the HW respondent was entered incorrectly
		f2) The ID code of the HW respondent was entered incorrectly
		f3) Both of these problems have been corrected; the corresponding .do file is 'Cleaning IDs of HW respondants'
	g) When mapping between the HH level data and the village level data, the following points need to be taken into account:
		g1) Household ID 061103020207 should be mapped to village ID 06110301
		g2) For the following 15 household IDs the village ID is ambiguous, so they cannot be mapped to the village level datasets:
			081502110101			081502110102			081502110103
			081502110104			081502110105			081502110106
			081502110107			081502110108			081502110109
			081502110110			081502110111			081502110112
			081502110113			081502110114			081502110115

	h) Questionnaires for this dataset are 'Final Bilingual HH Questionnaire General 1st July 2014' and 'Final Bilingual HH Questionnaire Women 1st July 2014' (see folder Midline_Final Questionnaires)
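
The ID construction in b) and the village mapping rules in g) can be sketched as follows. This is only an illustration in Python, not the project's own code: the function names and the dict-based record layout are hypothetical, and the sketch assumes that, apart from the exception in g1), a household's village ID is the first eight digits of its household ID.

```python
# Sketch only: function names and the dict-based record layout are assumptions.
# The variable names are the six codes listed in point b), in order.
HH_ID_PARTS = ["hh_103_1", "hh_105_1", "hh_107_1",
               "hh_109_1", "hh_111_1", "hh_101_1"]

# The 15 household IDs from point g2), 081502110101 to 081502110115.
AMBIGUOUS_HH_IDS = {f"0815021101{n:02d}" for n in range(1, 16)}

# Point g1): this household maps to a village other than its own prefix.
VILLAGE_OVERRIDES = {"061103020207": "06110301"}


def household_id(record):
    """Concatenate the six codes, zero-padding each one to two digits."""
    return "".join(f"{int(record[name]):02d}" for name in HH_ID_PARTS)


def village_id_for_household(hh_id):
    """Return the eight-digit village ID, or None when it is ambiguous."""
    if hh_id in AMBIGUOUS_HH_IDS:
        return None
    # Assumption: district + block + GP + village codes form the prefix.
    return VILLAGE_OVERRIDES.get(hh_id, hh_id[:8])


example = {"hh_103_1": 5, "hh_105_1": 9, "hh_107_1": 4,
           "hh_109_1": 2, "hh_111_1": 3, "hh_101_1": 4}
example_id = household_id(example)
```

Returning None for the ambiguous households keeps them out of any merge with the village level data rather than attaching them to a guessed village.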

ii) VG
	a) This is the village level dataset
	b) There should be a total of 193 villages
	c) Observations in this dataset are uniquely identified by the code generated by concatenation of district code (vg_b2_1), block code (vg_c2_1), Gram Panchayat Code (vg_d2_1) and village code (vg_e2_1)
		c1) These codes (vg_b2_1, vg_c2_1, vg_d2_1, vg_e2_1) will have to be prefixed with a zero if their value is less than 10 to generate this unique code
	d) Questionnaire for this dataset is 'Final Bilingual Village Questionnaire General -4th July 2014' (see folder Midline_Final Questionnaires)
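
A quick way to check points b) and c) is to build the village codes and confirm that they are unique and that the count matches the expected 193. The sketch below is a hypothetical Python helper, not part of the project's .do files; it assumes each observation is available as a dict keyed by variable name.

```python
from collections import Counter

# Variable names from point c); the helper names below are assumptions.
VG_ID_PARTS = ["vg_b2_1", "vg_c2_1", "vg_d2_1", "vg_e2_1"]
EXPECTED_VILLAGES = 193  # from point b)


def village_id(record, parts=VG_ID_PARTS):
    """Concatenate the codes, zero-padding each one to two digits."""
    return "".join(f"{int(record[name]):02d}" for name in parts)


def check_village_ids(records, expected=EXPECTED_VILLAGES, parts=VG_ID_PARTS):
    """Return (ids, duplicated ids, whether the count matches `expected`)."""
    ids = [village_id(r, parts) for r in records]
    duplicates = sorted(i for i, n in Counter(ids).items() if n > 1)
    return ids, duplicates, len(ids) == expected


# Tiny illustrative input with one deliberately repeated village.
rows = [
    {"vg_b2_1": 1, "vg_c2_1": 1, "vg_d2_1": 1, "vg_e2_1": 1},
    {"vg_b2_1": 10, "vg_c2_1": 20, "vg_d2_1": 2, "vg_e2_1": 2},
    {"vg_b2_1": 1, "vg_c2_1": 1, "vg_d2_1": 1, "vg_e2_1": 1},
]
ids, duplicates, count_ok = check_village_ids(rows)
```

The same helper can be pointed at the VW dataset by passing its variable names (vw_c2_1, vw_d2_1, vw_e2_1, vw_f2_1) as `parts`.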
		

iii) VW
	a) This is the village level FGD administered to a female only group
	b) FGDs for the following villages were supposed to be administered only to an SC/ST group:
		01010101		01010302		01010402		01020301
		01020302		01020401		02030101		02030201
		02030301		02030402		02040101		02040401
		02040601		03050201		03050401		03050602
		03060102		03060202		03060402		04070102
		04070301		04070501		04080201		04080301
		04080302		05090102		05090201		05090401
		05100401		05100402		05100501		06110101
		06110202		06110302		06110402		06120202
		06120402		06120602		07130201		07130302
		07130407		07140202		07140401		07140402
		08150201		08150301		08150302		08150401
		08150501		08160201		08160202		08160402
		09170101		09170301		09170302		09180201
		09180401		09180601		09180602		10190201
		10190301		10190302		10200102		10200201
		10200202

	c) There should be a total of 193 villages
	d) Observations in this dataset are uniquely identified by the code generated by concatenation of district code (vw_c2_1), block code (vw_d2_1), Gram Panchayat Code (vw_e2_1) and village code (vw_f2_1)
		d1) These codes (vw_c2_1, vw_d2_1, vw_e2_1, vw_f2_1) will have to be prefixed with a zero if their value is less than 10 to generate this unique code
	e) Questionnaire for this dataset is 'Final Bilingual Village Questionnaire women -4th July 2014' (see folder Midline_Final Questionnaires)
	
	
iv) GPLF
	a) This is a Gram Panchayat level dataset 
	b) This is a midline specific dataset which was administered only in project GPs (hence does not exist for baseline)
	c) There should be a total of 59 observations in this dataset as per the sample
		c1) GP ID 091801 is missing because the corresponding GP does not exist (hence 58 observations in the dataset)
	d) Observations in this dataset are uniquely identified by the code generated by concatenation of district code (gp_b2_1), block code (gp_c2_1) and Gram Panchayat Code (gp_d2_1)
		d1) These codes (gp_b2_1, gp_c2_1, gp_d2_1) will have to be prefixed with a zero if their value is less than 10 to generate this unique code
	e) Questionnaire for this dataset is 'Final_GPLF_5th July' (see folder Midline_Final Questionnaires)
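
The gap described in c1) can be reproduced by comparing the sampled GP IDs against the IDs built from the data. The Python sketch below is illustrative only: the function names are hypothetical, the sample list is a shortened made-up stand-in for the real 59-GP sample, and records are assumed to be dicts keyed by variable name.

```python
# Variable names from point d); everything else here is an assumption.
GP_ID_PARTS = ["gp_b2_1", "gp_c2_1", "gp_d2_1"]


def gp_id(record):
    """Build the six-digit GP ID, zero-padding each code to two digits."""
    return "".join(f"{int(record[name]):02d}" for name in GP_ID_PARTS)


def sampled_but_missing(sample_ids, records):
    """Return sampled GP IDs that have no observation in the data."""
    present = {gp_id(r) for r in records}
    return sorted(set(sample_ids) - present)


# Illustrative stand-ins: two sampled GPs, only one present in the data.
sample = ["091701", "091801"]
data = [{"gp_b2_1": 9, "gp_c2_1": 17, "gp_d2_1": 1}]
missing = sampled_but_missing(sample, data)
```

Run against the full sample list and the GPLF data, a check of this kind should report only 091801 if the dataset matches the description above.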
